Corona prepping with Finnish data: regression using a categorical boosting forest algorithm

Main question: at this point we are interested in a single classification task, i.e. what predicts whether people have maskless contacts with people outside their household

Research Document

Questions codebook

Method of delivery

Virtual Environments and Packages

Read in data, show info and data head
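A minimal sketch of this step with pandas. The column names and the inline CSV are hypothetical stand-ins, since the actual survey file and its variables are not shown here; in practice the data would be read from a file path instead.

```python
import io

import pandas as pd

# Hypothetical stand-in for the survey file (assumption: CSV format).
# With a real file this would be pd.read_csv("<path to the data file>").
raw_csv = io.StringIO(
    "respondent_id,age_group,mask_attitude,maskless_contact\n"
    "1,18-29,positive,0\n"
    "2,30-44,neutral,1\n"
    "3,45-59,negative,1\n"
    "4,60+,positive,0\n"
)
df = pd.read_csv(raw_csv)

df.info()         # column names, dtypes and non-null counts
print(df.head())  # first rows of the table
```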

Specify the feature list and the grouping variable, and set the grouping variable as categorical
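One way this could look, assuming the data sits in a pandas DataFrame. The names `grouping_var`, `features` and all column names are illustrative assumptions, not the notebook's actual variables.

```python
import pandas as pd

# Hypothetical toy data; the real codebook variables are not shown here.
df = pd.DataFrame({
    "age_group": ["18-29", "30-44", "45-59", "60+"],
    "mask_attitude": ["positive", "neutral", "negative", "positive"],
    "risk_perception": [1, 2, 3, 2],
    "maskless_contact": [0, 1, 1, 0],
})

# The grouping (target) variable and the list of predictors.
grouping_var = "maskless_contact"
features = [col for col in df.columns if col != grouping_var]

# Store the grouping variable as a pandas categorical.
df[grouping_var] = df[grouping_var].astype("category")

print(features)
print(df[grouping_var].dtype)
```

Deriving the feature list from the columns keeps the two in sync; an explicit hand-written list is the alternative when only a subset of the questions should enter the model.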

EDA on the target

Check the number of samples in the target
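A short sketch of the class-count check, using pandas `value_counts`. The target values below are made up for illustration; the real class balance is not known from this outline.

```python
import pandas as pd

# Hypothetical binary target: 1 = reported maskless contact, 0 = none.
target = pd.Series([0, 1, 1, 0, 1, 1], name="maskless_contact")

counts = target.value_counts()                # absolute number of samples per class
shares = target.value_counts(normalize=True)  # the same as proportions

print(counts)
print(shares.round(3))
```

Looking at proportions alongside raw counts makes class imbalance easier to spot before any model is fitted.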

Force all feature variables to categorical data
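A minimal sketch of the conversion with pandas, assuming a DataFrame `df` and a `features` list as hypothetical names. Numeric survey codes are treated as category labels here, which is an assumption about how the questions are coded.

```python
import pandas as pd

# Hypothetical toy data mixing string and integer-coded answers.
df = pd.DataFrame({
    "age_group": ["18-29", "30-44", "45-59", "60+"],
    "risk_perception": [1, 2, 3, 2],
    "maskless_contact": [0, 1, 1, 0],
})
features = ["age_group", "risk_perception"]

# Cast every feature column to the pandas "category" dtype in one call;
# columns not named in the mapping (the target) are left unchanged.
df = df.astype({col: "category" for col in features})

print(df.dtypes)
```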